ABSTRACT

We have publicly released a blinded mix of simulated SNe, with types (Ia, Ib, Ic, II) selected in
proportion to their expected rate. The simulation is realized in the griz filters of the Dark Energy
Survey (DES) with realistic observing conditions (sky noise, point spread function and atmospheric
transparency) based on years of recorded conditions at the DES site. Simulations of non-Ia type SNe
are based on spectroscopically confirmed light curves that include unpublished non-Ia samples donated
from the Carnegie Supernova Project (CSP), the Supernova Legacy Survey (SNLS), and the Sloan
Digital Sky Survey-II (SDSS-II). We challenge scientists to run their classification algorithms and
report a type for each SN. A spectroscopically confirmed subset is provided for training. The goals
of this challenge are to (1) learn the relative strengths and weaknesses of the different classification
algorithms, (2) use the results to improve classification algorithms, and (3) understand what
spectroscopically confirmed sub-sets are needed to properly train these algorithms. The challenge is
available at www.hep.anl.gov/SNchallenge, and the due date for classifications is May 1, 2010.
Subject headings: supernova light curve fitting and classification 
1. MOTIVATION

To explore the expansion history of the universe, increasingly large
samples of high quality SNe Ia light curves are being used to measure
luminosity distances as a function of redshift. With increasing sample
sizes, there are not nearly enough resources to spectroscopically
confirm each SN. Currently, the world's largest samples are from the
Supernova Legacy Survey (SNLS: Astier et al. (2006)) and the Sloan
Digital Sky Survey-II (SDSS-II: Frieman et al. (2008)), each with more
than 1000 SNe Ia, yet fewer than half of their SNe are spectroscopically
confirmed. The numbers of SNe are expected to increase dramatically in
the coming decade: thousands for the Dark Energy Survey (DES: Bernstein
et al. (2009)) and a few hundred thousand for the Panoramic Survey
Telescope and Rapid Response System (Pan-STARRS) and the Large Synoptic
Survey Telescope (LSST: Ivezic et al. (2008); LSST Science Book (2009)).
Since only a small fraction of these SNe will be spectroscopically
confirmed, photometric identification is crucial to fully exploit these
large samples.

In the discovery phase of accelerated cosmological expansion, results
were based on tens of high-redshift SNe Ia, and some samples included a
significant fraction of events that were not classified from a spectrum
(Riess et al. 1998; Perlmutter et al. 1999; Tonry et al. 2003; Riess
et al. 2004). While human judgment played a significant role in
classifying these SNe without a spectrum, more formal methods of
photometric classification have been developed over the past decade:
Poznanski et al. (2002); Dahlen & Goobar (2002); Sullivan et al. (2006);
Johnson & Crotts (2006); Poznanski et al. (2007); Kuznetsova & Connolly
(2007); Rodney & Tonry (2009). Some of these methods have been used to
select candidates for spectroscopic follow-up observations, but these
methods have not been used to select a significant photometric SN Ia
sample for a Hubble diagram analysis. In short, cosmological parameter
estimates from much larger and recent surveys are based solely on
spectroscopically confirmed SNe Ia (SNLS: Astier et al. (2006),
ESSENCE: Wood-Vasey et al. (2007), CSP: Freedman et al. (2009),
SDSS-II: Kessler et al. (2009)).

The main reason for the current reliance on spectroscopic
identification is that vastly increased spectroscopic resources have
been used in these more recent surveys. In spite of these increased
resources, more than half of the discovered SNe do not have a spectrum
and therefore photometric methods will eventually be needed to classify
the majority of the SNe. There are two difficulties limiting the
application of photometric classification. First is the lack of
adequate non-Ia data for training algorithms. Many classification
algorithms were developed using non-Ia templates^7 constructed from
averaging and interpolating a limited amount of spectroscopically
confirmed non-Ia data, and therefore the impact of the non-Ia diversity
has not been well studied. The second difficulty is that there is no
standard testing procedure, and therefore it is not clear which
classification methods work best.

To aid in the transition to using photometric SN classification, we
have released a public "SN Photometric Classification Challenge" to the
community, hereafter called SNPhotCC. The SNPhotCC consists of a blinded
mix of simulated SNe, with types (Ia, Ib, Ic, II) selected in proportion
to their expected rate. The challenge is for scientists to run their
classification algorithms and report a type for each SN. A
spectroscopically confirmed sub-set is provided so that algorithms can
be tuned with a realistic training set. The goals of this challenge are
to (1) learn the relative strengths and weaknesses of the different
classification algorithms, (2) use the SNPhotCC results to improve the
algorithms, and (3) understand what spectroscopically confirmed
sub-sets are needed to properly train these algorithms.

7 http://supernova.lbl.gov/nugent/nugent.templates.html

To address the paucity of non-Ia data, the CSP, SNLS and SDSS-II have
contributed unpublished spectroscopically confirmed non-Ia light
curves. These data are high-quality multi-band light curves of real
scientific value, and therefore we are grateful to the donating
collaborations. This non-Ia sample is likely to undersample the
potential variety in upcoming surveys like DES and LSST, but we
anticipate that this challenge will be a useful step away from the
overly simplistic studies that have relied on a handful of non-Ia
templates.

The outline of this release-note is as follows. A description of the
simulation is given in §2, and instructions for participants are in §3.
Comments on the evaluations and posting of results are given in §4.

2. THE SIMULATION 

The simulation is realized in the griz filters of the 
Dark Energy Survey (DES). The sky noise, point-spread 
function and atmospheric transparency are evaluated in 
each filter based on years of observational data from the 
ESSENCE project at the Cerro Tololo Inter-American 
Observatory (CTIO). For the five SN fields (3 sq deg 
each), the cadence is based on allocating 10% of the 
DES photometric observing time and most of the non- 
photometric time. The cadence used in this publicly 
available simulation was generated by the Supernova 
Working Group within the DES collaboration^8. Since 
the DES plans to collect data during 5 months of the 
year, incomplete light curves from temporal edge effects 
are included; i.e., the simulated explosion times extend 
well before the start of each survey season, and extend 
well beyond the end of the season. 

Simulated SNe Ia are based on models empirically derived from data. In
addition to the model parameters, we have applied tweaks to simulate
the anomalous Hubble scatter. While these tweaks are ad hoc, they have
not been ruled out by current observations. Simulated non-Ia SNe are
based on observed multi-color light curves (from CSP, SNLS, and SDSS)
that have been smoothed in each passband, and then K-corrected to the
appropriate redshift and filters.

A spectroscopically confirmed subset is based on observations on a
4-meter class telescope with a limiting r-band magnitude of 21.5, and
an 8-meter class telescope with a limiting z-band magnitude of 23.5.
The subset is randomly selected, and the number of spectroscopically
confirmed SNe (~1000) corresponds to the combined resources of the SNLS
and SDSS-II surveys. While this number of spectroscopic identifications
may be optimistic, it allows for further study of how the training
quality depends on the size of the spectroscopic sample.

8 Although two of us (RK & SK) are members of the DES, we 
have not included other DES colleagues in any discussions about 
this challenge, and we have made our best efforts to prevent our 
DES collaborators from obtaining additional information beyond 
that contained in this note. 



For the challenge that includes the host-galaxy photometric redshift,
the photo-z estimates are based on simulated galaxies (for DES)
analyzed with the methods in Oyaizu et al. (2008a,b). The average
host-galaxy photo-z resolution is 0.03.

Two simple selection criteria have been applied. First, each object
must have at least one observation with a signal-to-noise ratio (S/N)
above 5 (in any filter). Second, there must be at least 5 observations
after explosion, and there is no S/N requirement on these observations.
These requirements are relatively loose because part of the challenge
is to determine the optimal selection criteria. The total number of
simulated SNe that satisfy these loose selection requirements is
2 × 10^4, and corresponds to the 5 seasons planned for the DES.
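
The two selection criteria can be sketched in a few lines of Python. This is an illustration only, not the simulation code: the function name `passes_selection`, the `(mjd, snr)` tuple layout and the explicit `t_explosion` argument are assumptions made for this example, and we assume the S/N > 5 observation may occur at any epoch.

```python
from typing import Iterable, Tuple


def passes_selection(observations: Iterable[Tuple[float, float]],
                     t_explosion: float,
                     snr_min: float = 5.0,
                     n_after_min: int = 5) -> bool:
    """Apply the two loose selection criteria to one simulated SN.

    observations : (mjd, snr) pairs, any filter (hypothetical layout)
    t_explosion  : simulated explosion date, same units as mjd
    """
    obs = list(observations)
    # Criterion 1: at least one observation with S/N above 5 (any filter).
    has_high_snr = any(snr > snr_min for _, snr in obs)
    # Criterion 2: at least 5 observations after explosion, no S/N cut.
    n_after = sum(1 for mjd, _ in obs if mjd > t_explosion)
    return has_high_snr and n_after >= n_after_min
```

Tightening `snr_min` or `n_after_min` is one simple way for participants to explore stricter cuts of their own.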

3. TAKING THE CHALLENGE 

Two independent challenges have been generated: one 
with a host-galaxy photo-z, and another without any red- 
shift information. In addition to these challenges based 
on the entire light curve, there is also an early-epoch 
challenge based on the first six observations (in any fil- 
ter) with S/N > 4. On the night of the sixth observation, 
all observations made this night are included. Among the 
four challenges available, you may take any of them or 
all of them. 

The simulated light curves can be downloaded from 
the SNPhotCC website^9. The filter response functions 
are given in the files DES_[griz].dat. The file with the 
".LIST" suffix provides a list of all data files to analyze. 
The data files are self-documented and visual inspection 
should be adequate for preparing a parsing algorithm. 
The calibrated fluxes are defined as 

$$\mathrm{FLUXCAL} = 10^{(-0.4\, m + 11)} + \mathrm{noise} \qquad (1)$$

where m is the modeled AB magnitude of the SN, and the noise
contributions^10 include Poisson fluctuations, sky noise, and CCD
noise. The observed magnitudes are not provided because they are not
defined when noise fluctuations result in a negative flux; for fitting,
we recommend translating model magnitudes into fluxes as defined in
Eq. (1).
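
As a minimal sketch of this recommendation, the Python snippet below converts model magnitudes to noiseless calibrated fluxes using Eq. (1) and compares them with observed fluxes in flux space. The function names and the simple chi-square are illustrative assumptions, not part of the challenge software.

```python
def mag_to_fluxcal(mag: float) -> float:
    """Noiseless calibrated flux for a model AB magnitude, following Eq. (1)."""
    return 10.0 ** (-0.4 * mag + 11.0)


def chi2_flux(model_mags, fluxcal, fluxcal_err) -> float:
    """Chi-square between observed fluxes and model magnitudes.

    The comparison is done in flux, so negative observed fluxes are
    handled without any special treatment.
    """
    total = 0.0
    for m, f, e in zip(model_mags, fluxcal, fluxcal_err):
        total += ((f - mag_to_fluxcal(m)) / e) ** 2
    return total
```

With this convention a magnitude of 27.5 corresponds to a calibrated flux of 1, since 0.4 × 27.5 = 11.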

For tuning your algorithms, the spectroscopically confirmed sub-sample
is identified by the SNTYPE keyword (see Table 1), and the
corresponding redshift is given by the REDSHIFT_SPEC keyword. For the
majority of SNe that do not have spectroscopic identification, the type
and spectroscopic redshift are set to -9. For the host-galaxy photo-z
sample, the photo-z is given by the HOST_GALAXY_PHOTO-Z keyword. For
the early-epoch challenge, process only the observations that appear
before the "DETECTION:" keyword.
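
The keyword conventions above might be handled along the following lines. This is only a sketch: it assumes each keyword sits at the start of a line in the form `KEY: value`, which should be checked against the actual data files, and the helper names are hypothetical.

```python
def read_keyword(lines, key):
    """Return the first value after 'KEY:' as a string, or None if absent.

    Assumes a 'KEY: value' line layout (an assumption, not a documented format).
    """
    prefix = key + ":"
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(prefix):
            fields = stripped[len(prefix):].split()
            return fields[0] if fields else None
    return None


def is_spec_confirmed(lines):
    """True if SNTYPE is present and not the -9 'unknown' flag."""
    value = read_keyword(lines, "SNTYPE")
    return value is not None and int(value) != -9


def early_epoch_lines(lines):
    """Keep only the lines that appear before the DETECTION: keyword."""
    kept = []
    for line in lines:
        if line.strip().startswith("DETECTION:"):
            break
        kept.append(line)
    return kept
```

Reading the file as a list of lines first and then calling these helpers keeps the parsing independent of the details of the observation rows.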

A valid challenge submission must contain three items: 
(1) an answer list containing the type for each SN, (2) 
a brief description of your method, and (3) an estimate 
of the CPU resources. For a group effort, a team name 
is recommended. These submission items are discussed 
below in more detail. 

For each challenge that you participate in, your answer 
list must contain four columns: 

9 www.hep.anl.gov/SNchallenge 

10 The noise has been scaled from photoelectrons into FLUXCAL 
units. 
TABLE 1
Integer codes for SN types.

SN-type               integer code
Ia                    1
II (IIn, IIP, IIL)    2 (21, 22, 23)
Ibc (Ib, Ic)          3 (32, 33)
other                 66
rejected              -1

SNID TYPE PHOTOZ PHOTOZ_ERROR 

where 

• SNID is the SN integer ID.

• TYPE is the integer SN-type code returned by your 
classifier (see Table 1). You can report either a 
general type (1, 2, 3 for Ia, II, Ibc), or a specific 
sub-type.

• PHOTOZ is the photo-z value returned by your classifier.

• PHOTOZ_ERROR is the uncertainty on PHOTOZ.

If your code does not return a useful photo-z value, just set -9 in the
last two columns. A valid answer list must contain entries in all four
columns and for each SN; invalid answer files will be returned. In
addition to the answer file, please provide a brief description of your
technique. A reference to either a refereed journal article or arXiv
posting is adequate, but please describe any modifications from the
referenced article. Finally, include the processing time, the number of
light curves analyzed (i.e., that are not rejected by selection cuts)
and a description of your computing processor hardware.
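
A minimal sketch of writing a four-column answer list is shown below. The whitespace column separator and the helper names are assumptions for illustration; only the column order and the -9 convention come from the text above.

```python
def format_answer_row(snid, sn_type, photoz=None, photoz_error=None):
    """One answer-list row: SNID TYPE PHOTOZ PHOTOZ_ERROR.

    If no useful photo-z is available, both photo-z columns are set to -9.
    """
    if photoz is None or photoz_error is None:
        photoz, photoz_error = -9, -9
    return f"{snid} {sn_type} {photoz} {photoz_error}"


def format_answer_list(results):
    """Join rows for an iterable of (snid, type[, photoz, photoz_error]) tuples."""
    return "\n".join(format_answer_row(*row) for row in results)
```

For example, a rejected SN with no photo-z estimate would be passed as `(snid, -1)` and written with -9 in the last two columns.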

In addition to thinking about your classification algo- 
rithm, you should also think about appropriate selection 
cuts to reject SNe that are difficult to classify. Set the SN 
type to -1 for rejected SNe. As described in §4, our 
evaluation generally penalizes incorrect classifications more 
than it penalizes the loss from selection cuts. 

To maximize the utility of this challenge, please re- 
spect the following guidelines. While you can use the 
spectroscopically confirmed subset to train your algo- 
rithms, please use your program to report classifications 
from this subset; i.e., do not just report the spectroscopic 
SN type. A useful diagnostic in the evaluation 
will be to compare the classification performance from 
the training subset to that from the rest of the sample. 
In a similar spirit, do not use the spectroscopic redshift 
(REDSHIFT_SPEC) to report classifications. Finally, for 
the early-epoch challenge, use only the spectroscopically 
confirmed sub-sample for tuning your algorithms; i.e., do 
not use the full set of (unconfirmed) light curves. 

Don't hesitate to report problems or suggestions, in- 
cluding methods for evaluation. Missing information and 
updates will be appended to §5 and re-posted to the 
arXiv. You should periodically check this arXiv post- 
ing for updates. 

Finally, the due date is May 1, 2010. 



4. POSTING & EVALUATING THE CHALLENGE RESULTS 

Classification results from the participants will be 
posted publicly along with our initial evaluations and 
the answer key. Anyone can therefore evaluate the 
algorithms using their choice of figure-of-merit (FoM). We 
will also provide additional information about the 
simulation strategy, along with details for each simulated SN. 
For non-Ia type SNe based on K-correcting unpublished 
light curves, the level of detail that we release will be de- 
termined solely by the donating collaborations. Shortly 
before posting the answer key, we will ask the donating 
collaborations for instructions on what details can be re- 
leased. 

We finish with a discussion of ideas on how to evalu- 
ate the results. Ideally, we would like to assign a single 
number (FoM) for each algorithm. To make more refined 
comparisons, the FoM can be tabulated as a function of 
redshift or any other variable of interest. 

We begin the discussion by considering the FoM for a Ia rate
measurement based on photometric identification. After selection
requirements have been applied, let $N_{\rm Ia}^{\rm true}$ be the
number of correctly typed SNe Ia, and $N_{\rm Ia}^{\rm false}$ be the
number of non-Ia that are incorrectly typed as an SN Ia. A simple
classification FoM is the square of the signal-to-noise ratio (S/N)
divided by the total number of SNe Ia ($N_{\rm Ia}^{\rm TOT}$) before
selection cuts,

$$C_{\rm FoM-Ia} \equiv \frac{1}{N_{\rm Ia}^{\rm TOT}} \times \frac{(N_{\rm Ia}^{\rm true})^2}{N_{\rm Ia}^{\rm true} + W_{\rm Ia}^{\rm false} N_{\rm Ia}^{\rm false}} = \epsilon_{\rm Ia} \times \frac{N_{\rm Ia}^{\rm true}}{N_{\rm Ia}^{\rm true} + W_{\rm Ia}^{\rm false} N_{\rm Ia}^{\rm false}} , \qquad (2)$$

where $W_{\rm Ia}^{\rm false}$ is the false-tag weight (penalty factor)
described below, $\epsilon_{\rm Ia}$ is the SN Ia efficiency that
includes both selection and typing requirements, and
$N_{\rm Ia}^{\rm true} = \epsilon_{\rm Ia} N_{\rm Ia}^{\rm TOT}$. Since
$N_{\rm Ia}^{\rm TOT}$ is a constant that is independent of the
analysis, we have divided out this term so that
$0 \le C_{\rm FoM-Ia} \le 1$, with $C_{\rm FoM-Ia} = 1$ corresponding to
the theoretically optimal analysis.

The FoM in Eq. (2) is the product of two terms. The first term is the
efficiency for selecting and classifying type Ia SNe, and the second
term is the Ia purity (when $W_{\rm Ia}^{\rm false} = 1$): the fraction
of classified Ia that really are SNe Ia. In the ideal case where the
average of $N_{\rm Ia}^{\rm false}$ is perfectly determined,
$W_{\rm Ia}^{\rm false} = 1$ and the naive statistical uncertainty is
the only contribution to the FoM. In practice, uncertainties in
determining the false-tag rate lead to $W_{\rm Ia}^{\rm false} > 1$.
For example, suppose that $N_{\rm Ia}^{\rm false}$ is scaled from a
spectroscopically confirmed subset containing a fraction
($\epsilon_{\rm spec}$) of the total number of SNe; in this case,
$W_{\rm Ia}^{\rm false} = 1 + 1/\epsilon_{\rm spec}$ is much larger
than 1 if the spectroscopic subset is small. It may be possible to
reduce $W_{\rm Ia}^{\rm false}$ using other methods to determine
$N_{\rm Ia}^{\rm false}$, such as fitting the tails in the
distance-modulus residuals. For SN-cosmology applications, a proper
determination of $W_{\rm Ia}^{\rm false}$ is beyond the scope of this
classification challenge, but suggestions are welcome on setting an
appropriate value for the evaluations.

Next we illustrate the FoM with a numerical example in which the
false-tag rate is determined from a spectroscopic sub-sample with
$\epsilon_{\rm spec} = 0.2$, so that $W_{\rm Ia}^{\rm false} = 6$.
Consider a sample with 50% type Ia and 50% non-Ia. Assume that the
classification algorithm correctly identifies half of the SNe, while
for the other half the classification works so poorly that it is
equivalent to making random guesses with a 50% probability of guessing
correctly. If the ambiguous half is rejected, then
$\epsilon_{\rm Ia} = 0.5$, the purity term is 100% (since
$N_{\rm Ia}^{\rm false} = 0$), and $C_{\rm FoM-Ia} = 0.5$. Now consider
an analysis strategy without selection requirements. The efficiency
term increases to $\epsilon_{\rm Ia} = 75\%$ since 25% of the SNe Ia are
rejected by incorrect classifications. However, since the
false-classification rate increases to
$N_{\rm Ia}^{\rm false}/N_{\rm Ia}^{\rm true} = 1/3$, the purity term
drops to $1/(1 + 6 \cdot 1/3) = 1/3$ and the net FoM drops to
$C_{\rm FoM-Ia} = 1/4$. An algorithm that simply makes a random guess on
all SNe results in $C_{\rm FoM-Ia} = 1/14$. The point of this exercise
is to illustrate the importance of selection criteria, and that forcing
a classification on every SN candidate is not necessarily the optimal
strategy.
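
The three cases in this example can be reproduced with a short Python sketch of Eq. (2). The function name `fom_ia` and the choice of 100 SNe Ia plus 100 non-Ia are assumptions made for illustration; only the formula and the weight of 6 come from the text.

```python
def fom_ia(n_true: float, n_false: float, n_tot: float, w_false: float) -> float:
    """Figure of merit of Eq. (2).

    n_true  : SNe Ia correctly typed as Ia after all cuts
    n_false : non-Ia incorrectly typed as Ia
    n_tot   : total number of SNe Ia before selection cuts
    w_false : false-tag weight (penalty factor)
    """
    if n_true == 0:
        return 0.0
    efficiency = n_true / n_tot
    purity_term = n_true / (n_true + w_false * n_false)
    return efficiency * purity_term


# Sample of 100 SNe Ia and 100 non-Ia, with a false-tag weight of 6.
with_cuts = fom_ia(n_true=50, n_false=0, n_tot=100, w_false=6)       # ambiguous half rejected
without_cuts = fom_ia(n_true=75, n_false=25, n_tot=100, w_false=6)   # ambiguous half guessed
random_guess = fom_ia(n_true=50, n_false=50, n_tot=100, w_false=6)   # every SN guessed
```

The three values are 0.5, 0.25 and 1/14, matching the numbers quoted in the example.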

5. POST-RELEASE UPDATES 

• February 7, 2010: for the spectroscopically confirmed 
subset, sub-types are given as indicated in 
Table 1. Participants can either report a general 
classification (i.e., 1, 2, 3 for Ia, II, Ibc) or report a 
specific sub-type (e.g., IIn, Ic, etc.). Download the 
updated challenge data files only if you need the 
sub-types. 



• March 14, 2010: Fixed bug in which about 1% 
of the SNe have pathological late-time magnitudes. 
Download data files after date-stamp above. 

• March 24, 2010: Fixed bug in which a few dozen 
non-Ia SNe have pathological magnitudes at all 
epochs. 

• April 13, 2010: Fixed two bugs related to type II 
SNe. First, the wrong redshift was mistakenly used 
for one of the observed IIP, resulting in a 2 mag 
overestimate of its brightness. Second, for another 
type II SN the absolute mag was mistakenly set 0.3 
mag too bright. While the generated fraction of 
these buggy SNe was small, their contribution to 
the challenge sample after requiring S/N > 5 was 
relatively large; therefore the updated sample has 
~1400 fewer SNe. 

• April 27, 2010: No bug-fixes, but we have decided 
to fix $W_{\rm Ia}^{\rm false} = 3$ for the $C_{\rm FoM-Ia}$ 
calculation, and allow participants to optimize 
accordingly. Also, to help check for buggy submissions, 
please include your evaluation of the Ia purity 
and Ia efficiency for the spectroscopically 
confirmed subset. 
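
For the self-check requested in the April 27 update, the Ia purity and Ia efficiency on the spectroscopically confirmed subset could be computed as sketched below. The function name and the list-based inputs are assumptions for illustration; type code 1 denotes Ia as in Table 1, and rejected SNe (code -1) count as not tagged Ia.

```python
def ia_purity_efficiency(true_types, reported_types):
    """Return (purity, efficiency) for SNe Ia.

    true_types     : spectroscopic integer type codes (1 = Ia)
    reported_types : integer codes returned by the classifier, same order
    """
    pairs = list(zip(true_types, reported_types))
    n_ia_total = sum(1 for t, _ in pairs if t == 1)
    n_tagged_ia = sum(1 for _, r in pairs if r == 1)
    n_true = sum(1 for t, r in pairs if t == 1 and r == 1)
    # Purity: fraction of SNe tagged Ia that really are Ia.
    purity = n_true / n_tagged_ia if n_tagged_ia else 0.0
    # Efficiency: fraction of all true Ia that survive cuts and are tagged Ia.
    efficiency = n_true / n_ia_total if n_ia_total else 0.0
    return purity, efficiency
```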